Design of secondary indexes in HBase based on memory

CUI Chen, ZHENG Linjiang, HAN Fengping, HE Mujun

Journal of Computer Applications 2018, 38 (6): 1584-1590. DOI: 10.11772/j.issn.1001-9081.2017112777


In the age of big data, HBase, which can store massive amounts of data, is widely used. HBase indexes only the rowkey and does not create indexes on non-rowkey columns, which seriously degrades the efficiency of queries with complex conditions. To solve this problem, a new memory-based secondary index scheme for HBase was proposed. Indexes mapping each queried column to the rowkey were built and stored in an in-memory environment provided by Spark. At query time, the rowkey was first obtained from the index and then used to quickly locate the corresponding record in HBase. Because the suitable index type depends on the cardinality of the column and on whether range queries are required, different types of indexes were constructed to handle three different situations. Meanwhile, Spark's in-memory computation and parallelization were used to improve the query efficiency of the indexes. The experimental results show that the proposed secondary indexes achieve better query performance, with shorter query times than secondary indexes based on Solr. The proposed secondary indexes address the low query efficiency caused by the lack of indexes on non-rowkey columns in HBase, and improve query efficiency for large-scale data analysis on HBase storage.
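
The two-step lookup described in the abstract (index first, then rowkey) can be illustrated with a minimal single-process sketch. This is not the authors' implementation: it uses plain Python dictionaries instead of Spark and HBase, and the table contents, the column names, and the choice of a hash index for equality queries and a sorted index for range queries are all assumptions made for illustration. The abstract does not specify the three index types the paper uses.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

# Stand-in for an HBase table: rowkey -> record (hypothetical data).
table = {
    "r1": {"city": "A", "speed": 40},
    "r2": {"city": "B", "speed": 65},
    "r3": {"city": "A", "speed": 80},
    "r4": {"city": "C", "speed": 55},
}

def build_hash_index(table, column):
    """Map each column value to the rowkeys that hold it (equality queries)."""
    index = defaultdict(list)
    for rowkey, record in table.items():
        index[record[column]].append(rowkey)
    return index

def build_sorted_index(table, column):
    """Sorted (value, rowkey) pairs, suitable for range queries."""
    return sorted((record[column], rowkey) for rowkey, record in table.items())

def range_lookup(sorted_index, low, high):
    """Return rowkeys whose indexed value lies in the closed range [low, high]."""
    values = [value for value, _ in sorted_index]
    start = bisect_left(values, low)
    end = bisect_right(values, high)
    return [rowkey for _, rowkey in sorted_index[start:end]]

# Step 1: build the in-memory indexes on non-rowkey columns.
city_index = build_hash_index(table, "city")
speed_index = build_sorted_index(table, "speed")

# Step 2: resolve rowkeys from the index, then fetch records by rowkey.
rows_in_city_a = [table[k] for k in city_index["A"]]
rows_speed_50_to_70 = [table[k] for k in range_lookup(speed_index, 50, 70)]
```

In this sketch the final fetch by rowkey is a dictionary access; in the scheme the abstract describes, that step would be a rowkey read against HBase, and the indexes would be held and queried in parallel in Spark's memory.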
